Skip to content

Conversation

@jinchengchenghh
Copy link
Collaborator

@jinchengchenghh jinchengchenghh commented Apr 28, 2025

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 28, 2025
@netlify
Copy link

netlify bot commented Apr 28, 2025

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 8d59d01
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/68b56368be3a6b000858776f

@jinchengchenghh
Copy link
Collaborator Author

Can you help review? Thanks! @rui-mo @zhli1142015

Copy link
Collaborator

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Added two initial comments.

@mbasmanova
Copy link
Contributor

@jinchengchenghh Is there a GitHub issue that provides overall context for this work? If not, would it be possible to create one and explain what are we trying to achive?

@jinchengchenghh
Copy link
Collaborator Author

I create an issue #13980 just recently @mbasmanova

@mbasmanova mbasmanova changed the title feat: Support iceberg bucket function feat(iceberg): Add bucket function Jul 2, 2025
@jinchengchenghh
Copy link
Collaborator Author

Could you help review this PR? Thanks! @rui-mo

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jinchengchenghh Let's split this PR into one that adds hash functions and the one that adds BucketFunction.

template <typename TInput>
FOLLY_ALWAYS_INLINE Status
call(int32_t& out, const int32_t& numBuckets, const TInput& input) {
VELOX_RETURN_IF(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add some more macros to reduce boiler plate?

We have a throwing version:

VELOX_USER_CHECK_GT(numBuckets, 0, "Invalid number of buckets")

Perhaps, we could add non-throwing version:

VELOX_USER_RETURN_GT(numBuckets, 0, "Invalid number of buckets")

CC: @majetideepak @pedroerp @rui-mo

@jinchengchenghh jinchengchenghh marked this pull request as draft July 3, 2025 13:23
Copy link
Collaborator

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

Copy link
Contributor

@yingsu00 yingsu00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbasmanova @jinchengchenghh
Murmur3 is the hash function used by Iceberg's partition transform bucket(c,n). Its role is analagous to the hashXXX() functions in HivePartitionFunction. In general, a PartitionFunction for a specific connector is used in the following usage Scenarios:

  1. Data Exchange in Queries (e.g., joins or aggregates):

    • It determines which reducer/task a row should go to using PartitionFunction::partition(), based on the hash of the partition keys. The hash algorithm is different from connector to connector. E.g. Iceberg uses Murmur3, while Hive uses Java's hashCode() function. In most cases it's just simple bit shift and xor.
    • The bucket function ensures bucketed joins align buckets from both tables. Hive bucket function is usually just the mod of the hash value: bucket = hash(key) % num_buckets. This could result in a negative value, while Iceberg uses consitent bucket function that gurantees a positive value.
  2. File Writing:

    • When inserting into a partitioned table, Hive computes the partition number using HivePartitionFunction::partition() based on the algorithm mentioned above.
    • It maps a row’s to a bucket using the connector-specific bucket function based on the bucket calculation method mentioned above.
  3. Connector Specific Functions
    In Iceberg, users can directly query the partition functions, or use them in the filters. E.g. SELECT * FROM t WHERE bucket(c,4) = 3, or SELECT day('2002-01-01'). Note that these functions are Iceberg specific and have their unique semantics. For example, DAY(dt) is the number of days from 1970-01-01, while Presto default DAY(dt) is just the day in that particular month.

THis PR is only touching the 3rd case. However in a more holistic view, we will need to add IcebergPartitionFunction shortly. If we want to keep it aligned with Hive, we may want to put the murmur3 hash function and bucket function in IcebergPartitionFunction (HivePartitionFunction keeps the hash algorithm implementations for HIve). Then when we are registering the iceberg bucket(c,n) function, it can call the hash algorithm implementations from IcebergPartitionFunction. As an alternative, if we want to put all hash algorithms in a central place, we may need to refactor HivePartitionFunction and move all its hash implementation there too. That way we can keep the Hive and Iceberg implementations consistent, and meke the repo more manageable and readable.

I also wonder how we should set up the framework for connector specific functions. E.g. consider catalog validation (ensures these functions run on the correct catalogs, i.e. a user should not be able to run an Iceberg function on Hive tables). Also shall it be in a different "iceberg" namespace? We'd love to hear more suggestions.

@mbasmanova
Copy link
Contributor

@yingsu00 Ying, thank you for sharing detailed context. This is very helpful.

If we want to keep it aligned with Hive, we may want to put the murmur3 hash function and bucket function in IcebergPartitionFunction (HivePartitionFunction keeps the hash algorithm implementations for HIve). Then when we are registering the iceberg bucket(c,n) function, it can call the hash algorithm implementations from IcebergPartitionFunction.

I like this option. It would be helpful to hide the details of the hash computation inside these partition functions and avoid creating classes with duplicate widely-known names (Murmur3) and slightly different implementations.

I also wonder how we should set up the framework for connector specific functions. E.g. consider catalog validation (ensures these functions run on the correct catalogs, i.e. a user should not be able to run an Iceberg function on Hive tables). Also shall it be in a different "iceberg" namespace?

We do not have namespaces in Velox. It might be helpful to add these at some point. In the meantime, we recommend using prefixes.

a user should not be able to run an Iceberg function on Hive tables

Why not?

@jinchengchenghh
Copy link
Collaborator Author

For Data Exchange in Queries, in Gluten, it calls the function as other hive functions in the same way, Gluten does not call Velox Exchange.
Exchange hashpartitioning(staticinvoke(class org.apache.iceberg.spark.functions.BucketFunction$BucketInt, IntegerType, invoke, 3, a#6, IntegerType, IntegerType, false, true, true), 1), REBALANCE_PARTITIONS_BY_COL, [plan_id=758]

@yingsu00
Copy link
Contributor

a user should not be able to run an Iceberg function on Hive tables
Why not?

@mbasmanova Thanks for your information. In my understanding, different table format specifications(Hive, Iceberg, etc) support different functions. They differ in the following aspects:

  1. The set of functions.
  1. Functions with same names may have totally different meanings:
  • Iceberg day(ds) is a date or timestamp day, as days from 1970-01-01: day(“1970-02-01”) = 30
  • Hive day(ds) is the day part of a date or a timestamp string: day(“1970-02-01 00:00:00”) = 1, day(“1970-02-01”) = 1. Velox supported Presto day() function uses this definition.

So my understanding is:

  1. if a user is reading from a Hive table using Prestissimo, then he/she should be able to run Prestissimo built in functions and Hive functions that is already supported by Velox.
  2. if a user is reading from a Iceberg table using Prestissimo, then he/she should be able to run Prestissimo built in functions and Iceberg functions like day(ds). However in this case, the Iceberg day(ds) should be distinguished from Prestissimo built in day() function.

To avoid collisions, it would be ideal for Velox to adopt function namespaces so that, for example, hive.day() and iceberg.day() can coexist unambiguously. Hope this can be supported in the future.

@jinchengchenghh jinchengchenghh force-pushed the iceberg_bucket branch 2 times, most recently from 312015f to 324bee7 Compare August 11, 2025 15:48
@jinchengchenghh jinchengchenghh marked this pull request as ready for review August 11, 2025 15:51
@jinchengchenghh
Copy link
Collaborator Author

Could you help review again? Thanks! @mbasmanova @rui-mo

Copy link
Collaborator

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Added some nits.

@jinchengchenghh
Copy link
Collaborator Author

Do you have further comments? @rui-mo

velox_exec_test_lib
velox_expression
velox_memory
velox_dwio_common_test_utils
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious, why is this dependency needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, thanks!

@jinchengchenghh
Copy link
Collaborator Author

Do you have further comments? Thanks! @mbasmanova

@mbasmanova mbasmanova added the ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall label Sep 2, 2025
@facebook-github-bot
Copy link
Contributor

@Yuhta has imported this pull request. If you are a Meta employee, you can view this in D81621760.

@jinchengchenghh
Copy link
Collaborator Author

jinchengchenghh commented Sep 5, 2025

Could you help merge this one? Thanks! @Yuhta @mbasmanova

@mbasmanova
Copy link
Contributor

@jinchengchenghh I see that Jimmy is working on merging this PR. Should land soon.

@facebook-github-bot
Copy link
Contributor

@Yuhta merged this pull request in 3f7a211.

nimesh1601 pushed a commit to nimesh1601/velox that referenced this pull request Sep 10, 2025
Summary:
The implementation aligns with https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/transforms/Bucket.java

And described in Iceberg document https://iceberg.apache.org/spec/#partition-transforms

Resolves: facebookincubator#13980

Pull Request resolved: facebookincubator#13174

Reviewed By: kKPulla

Differential Revision: D81621760

Pulled By: Yuhta

fbshipit-source-id: a8051cbb2676a8db0fef95e41c5858004941b7ce

template <typename T>
FOLLY_ALWAYS_INLINE Status call(int32_t& out, int32_t numBuckets, T input) {
VELOX_USER_RETURN_LE(numBuckets, 0, "Invalid number of buckets.");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: return Status::UserError("Invalid number of buckets");

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the new API VELOX_USER_RETURN_LE, it returns UserError inside the API

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support Iceberg functions

7 participants